Parallel Data Cube Construction: Algorithms, Theoretical Analysis, and Experimental Evaluation

Authors

Ruoming Jin

Ge Yang

Gagan Agrawal

Abstract

Data cube construction is a commonly used operation in data warehouses. Because of the volume of data stored and analyzed in a data warehouse and the amount of computation that data cube construction involves, it is natural to consider parallel machines for this operation. This paper presents two new algorithms for parallel data cube construction, along with their theoretical analysis and experimental evaluation. Our work is based on a new data structure, called the aggregation tree, which results in minimally bounded memory requirements. An aggregation tree is parameterized by the ordering of dimensions. We prove that the same ordering of the dimensions minimizes both the computational and communication requirements for both algorithms. We also describe a method for partitioning the initial array, which again minimizes the communication volume for both algorithms. Experimental results further validate the theoretical results.
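For readers unfamiliar with the operation itself, the following is a minimal sequential sketch of what data cube construction computes: every one of the 2^n group-by aggregates of an n-dimensional array. It is not the aggregation tree or either parallel algorithm from the paper, which the abstract does not specify. The names `aggregate` and `build_cube`, the sparse dictionary representation, the use of summation as the aggregate, and the rule for choosing a parent view are all assumptions made for illustration.

```python
from itertools import combinations


def aggregate(view, kept, drop):
    """Sum out dimension `drop` from a view whose keys follow the order in `kept`."""
    pos = kept.index(drop)
    out = {}
    for key, value in view.items():
        reduced = key[:pos] + key[pos + 1:]
        out[reduced] = out.get(reduced, 0) + value
    return out


def build_cube(cells, n_dims):
    """Return {retained_dims: view} for all 2**n_dims group-bys.

    `cells` maps full coordinate tuples to measure values (assumed sparse layout).
    """
    full = tuple(range(n_dims))
    cube = {full: dict(cells)}
    # Go from larger views to smaller ones, so that each view is derived
    # from an already computed parent with exactly one more dimension.
    for size in range(n_dims - 1, -1, -1):
        for kept in combinations(full, size):
            # Hypothetical parent choice: re-add the smallest missing dimension.
            missing = min(d for d in full if d not in kept)
            parent = tuple(sorted(kept + (missing,)))
            cube[kept] = aggregate(cube[parent], parent, missing)
    return cube
```

Each view is derived from a parent view rather than from the original array, so the choice of parent (and hence the ordering of dimensions) affects how much work is done. The abstract's claim about an optimal dimension ordering concerns this kind of choice, although the paper's actual criterion is not reproduced here.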


Similar resources

Impact of Data Distribution, Level of Parallelism, and Communication Frequency on Parallel Data Cube Construction


Parallel Construction of Data Cubes on Multi-Core Multi-Disk Platforms

On-line Analytical Processing (OLAP) has become one of the most powerful and prominent technologies for knowledge discovery in VLDB (Very Large Database) environments. Central to the OLAP paradigm is the data cube, a multidimensional hierarchy of aggregate values that provides a rich analytical model for decision support. Various sequential algorithms for the efficient generation of the data c...


Computing Partial Data Cubes

The precomputation of the different views of a data cube is critical to improving the response time of data cube queries for On-Line Analytical Processing (OLAP). However, the user is often not interested in the set of all views of the data cube but only in a certain subset of views. In this paper, we study the problem of computing the partial data cube, i.e. a subset of selected views in the l...


Cube-Lifecycle Management and Applications

A common operation involved with the majority of algorithms relevant to On-Line Analytical Processing is aggregation, which can be extremely time-consuming if applied over large datasets. To overcome this drawback, scientists have proposed the precomputation and materialization of a large volume of aggregated data into a structure called data cube. Nevertheless, the construction and usage of th...


Parallel data cube construction for high performance on-line analytical processing

Decision support systems use On-Line Analytical Processing (OLAP) to analyze data by posing complex queries that require different views of data. Traditionally, a relational approach (ROLAP) has been taken to build such systems. More recently, multi-dimensional database techniques (MOLAP) have been applied to decision-support applications. Data is stored in multi-dimensional arrays which is a n...



Publication date: 2003
